Abstract 

Background: During the domestication of crops, individual plants with traits desirable for human needs have been 
selected from their wild progenitors. Consequently, the genetic and nucleotide diversity of genes associated with these 
selected traits is expected to be lower in crop plants than in their wild progenitors. In the present study, we 
surveyed the pattern of nucleotide diversity of two trait-specific genes, Wx and OsC1, which regulate 
amylose content and apiculus coloration, respectively, in cultivated rice varieties. The analyzed samples were 
collected from a wide geographic area in Northeast (NE) India and included contrasting phenotypes considered to 
be associated with the selected genes, namely glutinous and nonglutinous grains and colored and colorless apiculus. 

Results: No statistically significant selection signatures were detected in either the Wx or the OsC1 gene sequences. 
However, a low level of selection that varied across the length of each gene was evident. The glutinous varieties 
showed higher levels of nucleotide diversity at the Wx locus (π_tot = 0.0053) than the nonglutinous varieties 
(π_tot = 0.0043). The OsC1 gene revealed low levels of selection among the colorless apiculus varieties, which had lower 
nucleotide diversity (π_tot = 0.0010) than the colored apiculus varieties (π_tot = 0.0023). 

Conclusions: The results revealed that functional mutations at the Wx and OsC1 genes considered to be associated with 
specific phenotypes do not necessarily correspond to the phenotypes in indigenous rice varieties in NE India. This 
suggests that genomic regions other than those previously reported may also be involved in determining these 
phenotypes. 
Background 

The domestication of plants and animals is considered one of the most important events in human
history, as it increased food security to support a growing human population. The process of
domestication involves selection of individuals from wild progenitors to fulfill human needs [1].
Asian cultivated rice is one of the earliest domesticated crop species, selected for many traits
relevant for human consumption and large-scale agriculture. The most important domestication-related
traits and corresponding genes identified so far in rice, with significant morphological and
physiological modifications, include reduction in grain shattering [2,3], changes in grain
coloration [4], grain size and shape [5], grain fragrance and flavor [6], grain number [7], grain
weight [8] and grain stickiness [5]. The genes that control these traits are often called
'domestication genes' in crop plants. In addition to human-mediated selection for specific traits,
the environment in which crops are grown may also have played a major role in selection and in
changes in the genetic diversity of crop plants.

Domestication is often associated with a reduction in genetic variation in domesticated plants as
compared to their wild progenitors [1]. This is mainly due to population bottlenecks and artificial
selection of domestication genes for desirable traits. Domesticated plants are a product of
relatively small founder populations, in which only a sub-sample of the wild progenitor population
contributes to the genomes of cultivated plants [9]. As a result, a genome-wide loss of genetic
variation is found in cultivated plants [1]. Artificial selection targeted at specific desirable
traits controlled by domestication genes also reduces the genetic diversity in crop plants as
compared to their wild ancestors [10]. Many traits generally suitable for human needs have been
targets of selection during the domestication of crops. These traits and associated genes have
subsequently undergone changes in response to selection due to local environment and cultural
preferences (e.g., grain color, taste) [11]. Thus, analyses of the nucleotide sequences of
domestication genes are invaluable for gaining insights into the types of selection that have
occurred during domestication.

Several studies have demonstrated selective sweeps in domestication genes and genomic regions in
domesticated crops [12-14]. Olsen et al. [15] showed a one- to two-fold increase in selection
pressure in domestication genes as compared to genes under natural selection. However, the reduction
in genetic diversity within various regions of selected genes may vary depending on the relevance of
a given region for determining the trait.

Indigenous rice varieties cultivated in the Eastern Himalayan region of NE India are phenotypically
diverse, and many of them are intricately associated with local cultural and traditional practices.
One of the most important culinary and cultural practices found throughout NE India is the use of
glutinous rice as a food of choice during festival seasons [16]. Thus, along with nonglutinous rice
varieties, numerous glutinous rice varieties are widely cultivated in NE India. The glutinous or
nonglutinous nature of rice is primarily determined by the composition of starch in the endosperm
tissue. Starch in rice endosperm contains two types of polysaccharides, namely amylose and
amylopectin. Rice varieties with high amylose levels (~20-30%) tend to form discrete, noncohesive
(non-sticky) grains when cooked, whereas varieties with lower amylose levels form cohesive (sticky)
cooked grains, commonly known as glutinous [15]. Previous studies have shown that a mutation in the
Waxy (Wx) gene, which encodes granule-bound starch synthase, drastically reduces the synthesis of
amylose (to <1%) in the endosperm of glutinous rice [17]. The point mutation from G to T at the 5'
splice site of Wx intron 1 is known to cause incomplete post-transcriptional processing of the
pre-mRNA in glutinous rice varieties [17-19]. On the other hand, nonglutinous rice varieties possess
multiple Wx alleles and show wide variation in amylose content [20]. A highly variable
microsatellite, (CT)n, in the 5' untranslated exon 1 of the Wx gene is known to have many alleles,
and the size of the allele is correlated with the amylose content in rice varieties [20,21]. Some
nonglutinous and low-amylose varieties are also known to carry the G to T mutation at the 5' splice
site of the Wx gene, suggesting that this mutation in the Wx gene may not necessarily be responsible
for the glutinous phenotype [22-24].
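
As a small illustration of how a (CT)n microsatellite allele can be scored from a sequence read, the sketch below reports the repeat number of the longest uninterrupted CT run. The function name and the test fragment are hypothetical and are not taken from the Wx sequences analyzed here; real genotyping would also need to anchor the repeat to its position in exon 1.

```python
import re


def longest_ct_repeat(seq: str) -> int:
    """Return n for the longest uninterrupted (CT)n run in seq (0 if none)."""
    runs = re.findall(r"(?:CT)+", seq.upper())
    # Each repeat unit is two bases long, so n = run length / 2
    return max((len(run) // 2 for run in runs), default=0)


# Hypothetical fragment: a (CT)5 run followed by a shorter (CT)2 run
fragment = "GGACTCTCTCTCTAAGCTCTG"
print(longest_ct_repeat(fragment))
```

Taking the longest run is a simplifying choice; it would misreport an allele if an unrelated, longer CT tract occurred elsewhere in the read.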

Another morphological variation found among indigenous rice varieties in NE India is apiculus
coloration. The apiculus of the wild ancestor of cultivated rice, O. rufipogon, is pigmented,
whereas the apiculus of cultivated rice varieties can be colored or colorless. The colored apiculus
phenotype is attributable to anthocyanin pigments, which are known to be associated with coloration
in various plant parts. Anthocyanins perform multiple biological functions in plants, including
protection against UV radiation, defense responses, and signaling in plant-microbe interactions
[25,26]. Saitoh et al. [27] identified and mapped the OsC1 gene, which is responsible for
anthocyanin pigmentation and apiculus coloration in rice. Comparative sequence analysis revealed
that colorless lines differed from their colored counterparts by a 10-bp deletion in the R3 repeat
located within the third exon of the OsC1 gene [27].

In this study, we analyzed (a) mutations in the Wx and OsC1 genes in indigenous rice varieties in NE
India and their corresponding phenotypes, and (b) nucleotide diversity patterns in these genes
across rice varieties to detect selection signatures in domestication-related genes. Contrary to
expectations, we found greater levels of diversity at the Wx gene in glutinous varieties than in
nonglutinous varieties, and low levels of selection in colorless apiculus varieties, suggesting the
existence of other, as-yet unknown genes contributing to these phenotypes.

Methods 

Plant samples 

In the present study, 29 cultivated rice varieties (including 5 agronomically improved varieties)
and one wild rice species (O. rufipogon) from NE India were included (Figure 1). Two trait-specific
genes corresponding to contrasting phenotypes were chosen for study. The samples included five
glutinous and 24 nonglutinous varieties, and 8 colored apiculus and 21 colorless apiculus varieties
(Table 1). The wild rice species (O. rufipogon), which is nonglutinous and has a colored apiculus,
was used as an outgroup. Plant morphology and grain characteristics were recorded based on direct
observation, interviews with farmers in the field, or records from the International Rice Research
Institute (IRRI), Philippines. Seeds were germinated in Petri dishes, transferred to pots and grown
in the greenhouse. Leaf samples from seedlings were harvested and air dried, and genomic DNA was
extracted following a modified cetyltrimethylammonium bromide extraction protocol [28,29].

Figure 1 Map showing traditionally cultivated indigenous rice sampling sites in Northeast India. 



Loci studied, PCR amplification and sequencing 

We analyzed nucleotide polymorphism in two trait-specific genes: waxy (Wx), the gene associated with
granule-bound starch synthesis, and OsC1, the gene associated with anthocyanin biosynthesis and
apiculus coloration. Nucleotide sequences of the oligonucleotide primers used for amplification and
sequencing are given in Table 2. A ~2.7-kb portion of the Wx gene surrounding the previously
identified intron 1 splice donor site mutation, including the promoter sequence, the entire exon 1,
intron 1, the 5' end of exon 2, and the entire noncoding region within exon 2 (Figure 2A), was
sequenced following the protocol of Olsen and Purugganan [24]. The OsC1 gene region (~1.3 kb)
(Figure 2B) was amplified and sequenced following Saitoh et al. [27].

PCR amplifications were performed in an Applied Biosystems thermal cycler in a total reaction volume
of 25 µL consisting of 0.25 mM dNTP, 2.0 mM MgCl2, 2.5 µL of 10X buffer, 1.5 pmol of each primer and
0.2 U Taq polymerase. The thermal cycling profiles described in previous publications (Wx: [24], and
OsC1: [27]) were followed. The amplified DNA products were separated by electrophoresis on 1%
agarose gels containing 0.33 µg/ml ethidium bromide. The electrophoresis was performed at 90 V for
40 minutes in a 24 cm long electrophoretic apparatus containing 1X TBE electrode buffer. DNA
fragments on agarose gels were visualized using an ultraviolet (302 nm) transilluminator (UVP Inc.),
and the size of the amplified DNA fragments was determined using the GeneRuler 1 kb DNA ladder
(Fermentas) as a size standard. The PCR products were sequenced after purification using the
Bio-Basic PCR product purification kit (Bio-Basic Inc.).

Data analysis 

DNA sequence chromatograms were analyzed using the software program Geneious version 5.4.6
(http://www.geneious.com/) and visually inspected for ambiguities. The resulting consensus DNA
sequences were aligned using the software program ClustalW v2 [30]. The coding and non-coding
regions of the gene were identified by comparing with annotated DNA sequences of corresponding
genes downloaded from the GenBank.

Table 1 Rice variety names, phenotype, and functional mutations at the Wx and OsC1 genes

Variety         Grain quality   Wx 5' splice site   Wx (CT)n   Apiculus color   OsC1 10 bp deletion
Bas Beroin      Glutinous       T                   17         Colored          No
Til Bora        Glutinous       T                   17         Colored          No
Ranga Borah     Glutinous       G                   11         Colorless        Yes
Kaki beroin     Glutinous       G                   11         Colorless        Yes
Borua Beroin    Glutinous       T                   17         Colorless        No
Joha            Non Glutinous   G                   18         Colored          No
Bherapawa       Non Glutinous   G                   17         Colored          No
Lallatoi        Non Glutinous   G                   11         Colored          Yes
Kawanglawang    Non Glutinous   T                   17         Colored          No
Hati Hali       Non Glutinous   G                   18         Colored          No
Balam           Non Glutinous   G                   11         Colored          No
Bashful         Non Glutinous   G                   10         Colorless        No
Lahi            Non Glutinous   G                   17         Colorless        No
Borjahinga      Non Glutinous   G                   11         Colorless        No
Moircha         Non Glutinous   G                   11         Colorless        Yes
Aubalam         Non Glutinous   G                   11         Colorless        Yes
Papue           Non Glutinous   G                   20         Colorless        Yes
Sorpuma         Non Glutinous   G                   10         Colorless        Yes
Mimutim         Non Glutinous   G                   18         Colorless        Yes
Local Basmati   Non Glutinous   G                   11         Colorless        Yes
Arfa            Non Glutinous   G                   11         Colorless        Yes
Mulahail        Non Glutinous   G                   10         Colorless        Yes
Guaroi          Non Glutinous   G                   17         Colorless        Yes
Harinarayan     Non Glutinous   G                   17         Colorless        Yes
Ranjit          Non Glutinous   G                   11         Colorless        Yes
IR8             Non Glutinous   G                   11         Colorless        Yes
Bahadur         Non Glutinous   G                   11         Colorless        Yes
Pankaj          Non Glutinous   G                   12         Colorless        Yes
Joya            Non Glutinous   G                   11         Colorless        Yes
O. rufipogon    Non Glutinous   G                   7          Colored          No

Abbreviations: Wx, waxy gene; (CT)n, number of CT repeats.

To examine the patterns of nucleotide diversity resulting from evolutionary changes in DNA sequences
in relation to neutral expectations and signatures of selection due to the domestication process,
the analyses described below were performed using the software program DnaSP version 5.1 [31].
θ_W based on the number of segregating sites [32], π based on mean pairwise nucleotide differences
among sequences [33], Tajima's D [34], and Fu and Li's D* and F* [35] were calculated, and the
McDonald and Kreitman [36] test was performed. D* and F* are more sensitive than Tajima's D in
detecting deviations from neutrality based on low-frequency polymorphisms, population expansion and
positive selection [35]. The McDonald and Kreitman [36] test is insensitive to demographic histories
and geographic structuring of the populations. Thus, the use of a variety of approaches that differ
in their underlying assumptions provides a means to discern the historical processes that shaped the
patterns of nucleotide diversity. The changes in nucleotide diversity and associated statistics in
different regions of the gene were examined using the sliding-window analysis approach.
The rates of synonymous (dS) and non-synonymous (dN) substitution in each of the selected genes
among different rice types were calculated. The ratio of dN/dS provides an insight into the
long-term selection pressure and purifying selection during the domestication process. The number
of haplotypes was calculated and the haplotype network diagram was constructed using NETWORK 4.5.1
(Fluxus Technology Ltd. at www.fluxus-engineering.com).

Table 2 List of genes surveyed and primer sequences used in the study

Gene name    Primer name   Primer sequence (5' - 3')              Functional association
Waxy [24]    WxU1F         GCCGAGGGACCTAATCTGC                    Granule-bound starch synthase
             Wx1R          TGGTGTGGGTGGCTAmGTAG
             Wx2FaF        GCCCCGCATGTCATCGTC
             Wx2R          GTOTCTAGCTGTOCTGTGGA
             Wx1Fint       TOTCAGCACGTACAAGCA
             Wx2Rint       (^rTATATATA I I I \CC\ I I C ACC A A
OsC1 [27]    OsC1F1        ATCGCTCAGTCTCACACCGCA                  Anthocyanin biosynthesis
             OsC1F3        GAGGGAGAATGGGGAGGAGAGC
             OsC1F4        TAATOTGATCTGTATGGATGCTG
             OsC1F5        GATCGATCGTGTATATATGTOTCAGGT
             OsC1R6        GTOCTGTGTCGGTGTCGGCG
             OsC1R7        ATGGCCGTCTCCTAATOCCCTGC
             OsC1R2        CGTACGGACGACGAACTAATGTCAC
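
The diversity statistics named above (number of segregating sites S, π, Watterson's θ_W and Tajima's D) can be sketched in a few lines. This is a minimal illustration of the standard formulas, not the DnaSP implementation used in the study; the function name is hypothetical, and dropping every alignment column that contains a gap or N is a simplifying assumption.

```python
from itertools import combinations
from math import sqrt


def diversity_stats(seqs):
    """Return (S, pi_per_site, theta_w_per_site, tajima_d) for aligned sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("Tajima's D needs at least 4 sequences")
    # Keep only columns without gaps or ambiguous bases (assumption)
    cols = [c for c in zip(*(s.upper() for s in seqs)) if "-" not in c and "N" not in c]
    length = len(cols)
    # Segregating sites: columns with more than one distinct base
    seg = sum(1 for c in cols if len(set(c)) > 1)
    # k: mean number of pairwise differences over all sequence pairs
    pairs = list(combinations(range(n), 2))
    k = sum(sum(1 for c in cols if c[i] != c[j]) for i, j in pairs) / len(pairs)
    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    if seg == 0:
        d = float("nan")  # D is undefined without segregating sites
    else:
        d = (k - seg / a1) / sqrt(e1 * seg + e2 * seg * (seg - 1))
    return seg, k / length, seg / a1 / length, d


# Toy alignment (hypothetical, not study data)
toy = ["ACGT", "ACGA", "ATGA", "ATGT"]
print(diversity_stats(toy))
```

Positive D arises when π exceeds θ_W (an excess of intermediate-frequency variants) and negative D when rare variants dominate, which is how the sign of D is read in the Results.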

Results 

A total of 53 indel polymorphisms with an average length of 3.525 nucleotides were detected in the
two sequenced regions (Table 3). The indels ranged in length from one to 20 nucleotides and occurred
in both coding and noncoding regions. Single nucleotide polymorphisms (SNPs) were more frequent than
indels. A total of 91 SNPs were found across the sequenced regions, an average of 1 SNP every 44.33
nucleotides.



Polymorphism of the Wx gene 

The aligned length of the Wx gene, including both coding and noncoding regions, was 2770
nucleotides. A total of 50 indels with an average length of 2.12 nucleotides were detected across
all samples. Exon 1 (5' untranslated region) of the Wx gene contained a highly variable
microsatellite, (CT)n. A total of seven alleles of this microsatellite (n = 7, 10, 11, 12, 17, 18,
and 20) were detected among the rice varieties included in the present study. Alleles (CT)10,
(CT)11, (CT)17, and (CT)18 were found in 3, 13, 8 and 3 cultivated varieties, respectively. The
(CT)12 and (CT)20 alleles were found in one cultivated variety each. A unique (CT)7 allele was found
in the wild rice O. rufipogon. The number of SNPs was higher than the number of indels, with a total
of 84 SNPs, an average of 1 SNP per 32.98 bp among all samples. Fewer SNPs (1) and indels (6) were
found in glutinous varieties than in the nonglutinous varieties (17 indels and 7 SNPs). The total
number of mutations was also higher among the nonglutinous varieties than in the glutinous varieties
(Table 3).

Figure 2 The locations of the coding and non-coding regions of the Wx (A) and OsC1 (B) genes. Arrows
at the bottom indicate primers used for PCR amplification.

Table 3 Lengths of aligned nucleotide sequences (bp) and site categories

Site category                                    Waxy    OsC1
Total length including indels                    2770    1296
Total no. of sites excluding indels              2574    1284
No. of indels                                    195     12
No. of indel polymorphisms                       50      3
Length of coding region excluding indels         177     809
Length of coding region including indels         197     824
Length of noncoding region excluding indels      2574    475
Length of noncoding region including indels      2593    476
SNP                                              84      7
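
The microsatellite allele counts reported for the Wx exon 1 (CT)n repeat can be tallied directly from the repeat numbers listed in Table 1. The sketch below does this for the 29 cultivated varieties (O. rufipogon, with its unique (CT)7 allele, is left out); the variable names are illustrative only.

```python
from collections import Counter

# (CT)n repeat numbers of the 29 cultivated varieties, transcribed in Table 1 order
ct_repeats = [
    17, 17, 11, 11, 17,                                      # glutinous
    18, 17, 11, 17, 18, 11, 10, 17, 11, 11, 11, 20, 10, 18,  # nonglutinous
    11, 11, 10, 17, 17, 11, 11, 11, 12, 11,                  # nonglutinous (cont.)
]

allele_counts = Counter(ct_repeats)
for n_repeats, count in sorted(allele_counts.items()):
    print(f"(CT){n_repeats}: {count} varieties")
```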

The G to T mutation at the 5' splice donor site of Wx intron 1, which is known to be associated with
a drastic reduction in amylose synthesis in glutinous rice varieties [17], was not consistently
present among the glutinous rice varieties included in the present study. The T nucleotide was
present in four varieties, while the G nucleotide was found in the remaining 25 cultivated rice
varieties and in the wild rice. The T nucleotide was found in three of the five glutinous varieties
(Borua Beroin, Bas Beroin and Til Bora), and the G nucleotide was present in the other two glutinous
varieties (Ranga Borah and Kaki beroin). Conversely, the T nucleotide at this site was found in one
nonglutinous variety (Kawanglawang).

The nucleotide diversity analyses showed that the nucleotide diversity of glutinous varieties was
higher (π_tot = 0.0053; θ_tot = 0.0043) than that of the nonglutinous varieties (π_tot = 0.0043;
θ_tot = 0.0033). The sliding window analysis of the Wx gene revealed high nucleotide diversity in
three regions located at 1 to 600, 1150 to 2000 and 2300 to 2500 bp of the gene. This analysis
further revealed that polymorphic sites were mostly located at the beginning and end of the promoter
region, in exon 1 carrying the microsatellite, and in the first part of intron 1 (Figure 3).
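
A sliding-window profile of π like the one described above can be sketched as follows, using the 50 bp window and 25 bp step stated in the figure captions. The function names are hypothetical, the alignment is assumed to be gap-free, and the toy sequences are not study data.

```python
from itertools import combinations


def pi_per_site(seqs):
    """Mean pairwise differences per site for equal-length aligned sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / len(pairs) / len(seqs[0])


def sliding_pi(seqs, window=50, step=25):
    """Return (window start, pi) for each full window along the alignment."""
    length = len(seqs[0])
    return [
        (start, pi_per_site([s[start:start + window] for s in seqs]))
        for start in range(0, length - window + 1, step)
    ]


# Toy alignment of 100 bp with a single difference at position 10 (0-based)
ref = "A" * 100
alt = "A" * 10 + "T" + "A" * 89
for start, pi in sliding_pi([ref, alt]):
    print(start, pi)
```

Only full windows are reported here, so a trailing partial window at the end of the alignment is ignored.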



Neutrality analysis at the Wx locus 

The values of Tajima's D and Fu and Li's D* and F* at the Wx locus were not significantly different
from neutral expectations. The values of D, D* and F* were positive for glutinous and nonglutinous
varieties at the Wx locus (Table 4), which may indicate weak overdominant selection or a reduction
in population size. The sliding window analyses of Tajima's D showed that glutinous varieties had
only positive values, while nonglutinous varieties had both positive and negative values in
different regions of the gene (Figure 4). Negative D values were detected in the regions 1357-1432,
1575-1655, 2400-2476 and 2659-2735 bp, and only in nonglutinous varieties. These regions are located
in introns 1 and 2 and in exon 1 of the Wx gene. The observed pattern of variability is not
significantly different from the variability expected under the neutral model of evolution, and the
neutrality hypothesis cannot be rejected. The McDonald and Kreitman test did not show a departure
from neutrality for the glutinous and nonglutinous varieties (Table 4), indicating no signature of
selection at the Wx locus.
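
The McDonald and Kreitman test compares fixed and polymorphic counts at silent and nonsynonymous sites in a 2x2 table. The sketch below applies a two-sided Fisher's exact test to counts transcribed from Table 5; using Fisher's exact test here is an assumption for illustration (the source does not state which test of independence was applied), and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Rows: nonsynonymous (fixed, polymorphic), silent (fixed, polymorphic)
p_wx_glutinous = fisher_exact_two_sided(2, 2, 80, 22)
p_osc1_colored = fisher_exact_two_sided(1, 0, 3, 6)
print(p_wx_glutinous, p_osc1_colored)
```

With counts this small at nonsynonymous sites, the test has little power, so a non-significant result is weak evidence of neutrality rather than proof of it.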

The analyses of SNPs revealed 16 distinct Wx haplotypes among the studied rice varieties, including
the wild rice (Figure 5), which formed two distinct groups (haplotypes 1-5 and haplotypes 6-15).
Haplotypes H1 and H7 were each represented by one glutinous variety. Two varieties with haplotype H2
and one variety with haplotype H9 were also glutinous. The analyses based on SNPs and indels
together revealed 28 haplotypes, and indel-only analyses revealed 26 haplotypes among the studied
samples.

Figure 3 Nei's nucleotide diversity (π) patterns along the Wx gene in sliding window analysis among
glutinous and nonglutinous grain types. Analysis was performed using a window length of 50 bp and
steps of 25 bp. Legend symbols mark the promoter region, exons and introns.

Table 4 Levels of nucleotide variation at the two studied genes

Gene   Ecotype        Indel   SNP   S    π_tot    θ_tot    dN/dS   D         D*        F*
Wx     Glutinous      6       1     23   0.0053   0.0043           1.7295    1.7295    1.8583
       Nonglutinous   17      7     31   0.0043   0.0033           1.1825    0.9145    1.369
OsC1   Colored        2       1     6    0.0023   0.0020           0.8109    1.0088    1.1449
       Colorless      3       8     10   0.0010   0.0021   1.00    -1.7683   -1.2847   -1.7178

S, number of segregating sites; π, average number of nucleotide differences per site between two
sequences [33] calculated on the total number of polymorphic sites (π_tot), silent sites (π_sil),
synonymous sites (π_syn) and nonsynonymous sites (π_nonsyn); θ, Watterson's estimator of nucleotide
polymorphism per base pair [32] calculated on the total number of polymorphic sites (θ_tot), silent
sites (θ_sil), synonymous sites (θ_syn) and nonsynonymous sites (θ_nonsyn); D, Tajima's D [34]; D*,
Fu and Li's D*; F*, Fu and Li's F* [35].

Tajima's D and Fu and Li's D* and F* were not significant (P > 0.10).
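
Counting haplotypes amounts to grouping samples that share an identical sequence at the variable sites of the alignment. The following sketch shows that grouping step only; it is not the NETWORK procedure used to draw the haplotype networks, and the sample names and sequences are hypothetical.

```python
from collections import defaultdict


def haplotypes(named_seqs):
    """Group sample names by their bases at the variable alignment columns."""
    names = list(named_seqs)
    cols = list(zip(*(named_seqs[name] for name in names)))
    # Variable columns are those with more than one distinct character
    variable = [i for i, col in enumerate(cols) if len(set(col)) > 1]
    groups = defaultdict(list)
    for name in names:
        key = "".join(named_seqs[name][i] for i in variable)
        groups[key].append(name)
    return dict(groups)


# Hypothetical aligned sequences
toy = {"v1": "ACGT", "v2": "ACGT", "v3": "ATGT", "v4": "ATGA"}
for hap, members in haplotypes(toy).items():
    print(hap, members)
```

If gap characters are kept in the input, indel states count as differences too, which mirrors the distinction between SNP-only and SNP-plus-indel haplotype counts.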

Polymorphism at the OsC1 gene 

The aligned OsC1 gene region was 1296 bp long and included both exons and introns. The results of
the present study showed that 62% of the sequenced samples contained the 10 bp deletion in the R3
repeat region of the OsC1 gene, which is known to cause a frameshift leading to colorless apiculus
in rice [27]. In agreement with the phenotype expected from the genotype, the 10 bp deletion was
found in 17 colorless apiculus varieties included in the present study, and the deletion was absent
in seven colored apiculus varieties and in O. rufipogon (Table 1). However, there were
incongruences between the genotype and the phenotype of several varieties examined in the present
study. The 10 bp deletion was not found in four colorless apiculus varieties (Bashful, Borua Beroin,
Lahi and Borjahinga), and the 10 bp deletion was found in one of the colored apiculus varieties
(Lallatoi).
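
The genotype-phenotype comparison above can be reproduced from Table 1 by cross-tabulating apiculus color against the presence of the 10 bp deletion. The sketch below covers the 29 cultivated varieties only; the grouping into four lists is an illustrative encoding of Table 1, not a data format used in the study.

```python
# Cultivated varieties from Table 1, grouped by apiculus color and deletion status
colored_without_deletion = [
    "Bas Beroin", "Til Bora", "Joha", "Bherapawa",
    "Kawanglawang", "Hati Hali", "Balam",
]
colored_with_deletion = ["Lallatoi"]
colorless_without_deletion = ["Borua Beroin", "Bashful", "Lahi", "Borjahinga"]
colorless_with_deletion = [
    "Ranga Borah", "Kaki beroin", "Moircha", "Aubalam", "Papue", "Sorpuma",
    "Mimutim", "Local Basmati", "Arfa", "Mulahail", "Guaroi", "Harinarayan",
    "Ranjit", "IR8", "Bahadur", "Pankaj", "Joya",
]

n_total = (len(colored_without_deletion) + len(colored_with_deletion)
           + len(colorless_without_deletion) + len(colorless_with_deletion))
n_deletion = len(colored_with_deletion) + len(colorless_with_deletion)
# Varieties whose phenotype is not the one expected from the OsC1 genotype
incongruent = colored_with_deletion + colorless_without_deletion

print(f"deletion carriers: {n_deletion}/{n_total} ({100 * n_deletion / n_total:.0f}%)")
print("incongruent varieties:", ", ".join(incongruent))
```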

Three non-synonymous substitutions were detected in the coding regions of the OsC1 gene. One single
nucleotide polymorphism (SNP) was detected in exon 1, a G to C mutation at position 60 resulting in
an amino acid change from positively charged Lysine to negatively charged Aspartic acid. Another SNP
was detected in exon 1, a C to G mutation at position 122 in the variety Bashful, resulting in an
amino acid change from non-polar Proline to positively charged Arginine. The third non-synonymous
substitution was a G to T mutation at position 845 in exon 3, resulting in an amino acid change from
Alanine to Valine (both hydrophobic). In addition, eight SNPs were detected in the intronic regions
of the OsC1 gene among different cultivated varieties and the wild rice.

The analyses of nucleotide sequences of the OsC1 gene revealed three indels (average 3.22 bp long)
and seven SNPs (on average one SNP every 185.14 bp) among the sequenced samples. More indels and
SNPs were found in colorless apiculus varieties than in the colored apiculus varieties (Table 4).
However, the nucleotide diversity (π; [33]) was higher in the colored apiculus rice varieties than
in the colorless apiculus varieties (Table 4). The sliding window analysis of the OsC1 gene showed
that parts of intron 2 and exon 3 at 400 to 625, 800 to 900 and 1050 to 1250 bp are polymorphic, and
that the nucleotide diversity in colored apiculus varieties is higher than in the colorless apiculus
rice varieties (Figure 6).

Figure 4 Tajima's D statistics in sliding window analysis for the Wx gene among rice ecotypes and
glutinous and nonglutinous rice varieties. Computation was performed using a window length of 50 bp
and steps of 25 bp. Legend symbols mark the promoter, exons and introns.

Neutrality analysis 

The overall values of Tajima's D and Fu and Li's D* and F* were negative in colorless apiculus rice
varieties and positive in colored apiculus varieties (Table 4). The sliding window analyses of
Tajima's D showed mostly negative values in colorless apiculus varieties and mostly positive values
in the colored apiculus rice varieties (Figure 7). These values were not significantly different
from neutral expectations. The negative D values in colorless apiculus varieties were detected at
positions 25-150, 400-475, 525-700, 811-886 and 1161-1237 bp, and a positive value was observed at
475-525 bp. In contrast, colored apiculus varieties showed positive D values in most regions
(400-475, 525-625, 811-886 and 1161-1237 bp) and negative values in the 475-525 bp region
(Figure 7). In general, the colorless apiculus varieties showed negative D values in exon 1,
intron 2 and exon 3, and a positive D value in intron 2. Interestingly, an opposite trend was
observed in colored apiculus varieties, with positive D values in intron 2 and exon 2 and a negative
D value in intron 2. These D values, which are not significantly different from neutral
expectations, indicate that the neutrality hypothesis cannot be rejected for the OsC1 gene region.
The McDonald and Kreitman test did not show evidence of selection in the OsC1 gene (Table 5).
Altogether nine haplotypes were detected in the OsC1 gene (Figure 8). Haplotypes H8 (three
varieties) and H4 (one variety) were found only in colored apiculus varieties, while haplotypes H1
and H6 were found in both colored and colorless apiculus varieties. Other haplotypes were found only
in colorless apiculus varieties. The analyses based on SNPs and indels yielded 15 haplotypes, and
analyses of indel polymorphisms yielded seven haplotypes among the colored and colorless apiculus
rice varieties.

Figure 6 Nei's nucleotide diversity (π) patterns along the OsC1 gene in sliding window analysis
among colored and colorless apiculus rice varieties. Analysis was performed using a window length
of 50 bp and steps of 25 bp; the x-axis shows nucleotide position. Legend symbols mark exons and
introns.

Figure 7 Tajima's D statistics in sliding window analysis for the OsC1 gene among the colored and
colorless apiculus rice varieties. Computation was performed using a window length of 50 bp and
steps of 25 bp; the x-axis shows nucleotide position. Legend symbols mark exons and introns.

Discussion 

The present study reports the findings of analyses of DNA sequence variability of two
trait-specific genes in indigenous rice varieties in the Eastern Himalayan region of NE India. The
Wx gene is associated with amylose synthesis, which determines the glutinous or nonglutinous nature
of rice grains. The OsC1 gene is involved in the synthesis of anthocyanin and is associated with
coloration of the apiculus in rice grains. The rice varieties used in this study include glutinous
and nonglutinous as well as colored and colorless apiculus types collected from a broad geographic
area covering most of NE India.

The present study revealed that previously identified mutations do not exclusively contribute to
the corresponding phenotypes in rice varieties. For example, the glutinous nature of most rice
varieties is considered to be a result of a G to T mutation at the 5' splice donor site of intron 1
of the Wx gene [18,22]. In the present study, three of the five glutinous rice varieties carried the
G to T mutation at the Wx gene, while this mutation was not detected in the other two glutinous rice
varieties. On the other hand, one of the 24 nonglutinous rice varieties carried the G to T mutation
while maintaining the nonglutinous phenotype. This finding suggests that genes or genomic regions
other than the ones previously reported are associated with the glutinous and nonglutinous
phenotypes of cultivated rice. Similarly, several reports indicated a correlation between variation
in amylose content and the number of repeats in the microsatellite region within the Wx gene
[37,38]. Although the present study also reports the occurrence of a highly variable microsatellite
locus within the Wx gene, there was no direct correlation between the number of repeats and the
glutinous nature of rice grains.

Table 5 McDonald-Kreitman test for the Wx and OsC1 genes between different types and O. rufipogon

                                       Silent                   Nonsynonymous
Locus   Ecotypes and grain qualities   Fixed(a)   Polymorphic   Fixed(a)   Polymorphic
Wx      Glutinous                      80         22            2          2
        Nonglutinous                   80         25            2          3
OsC1    Red apiculus                   3          6             1          0
        Colorless apiculus             3          8             1          2

(a) Fixed differences in comparison with O. rufipogon.

Figure 8 Haplotype network based on the OsC1 gene.

Analyses of the OsC1 locus also revealed similar patterns. The colorless apiculus in rice varieties
is often attributed to a 10 bp deletion in the OsC1 gene [27]. Although 17 of the 21 varieties with
colorless apiculus included in the present study had the 10 bp deletion in the OsC1 gene, four
varieties without the 10 bp deletion showed the colorless phenotype. Similarly, seven varieties and
O. rufipogon, all without the 10 bp deletion, showed the colored apiculus phenotype as expected,
whereas one of the varieties with the 10 bp deletion showed the colored apiculus phenotype. Thus,
the apiculus color phenotype of 18% of indigenous rice varieties in NE India did not correspond to
the reported apiculus color determining genotype of the OsC1 gene.

One of the varieties with the colorless apiculus phenotype (Mimutim) had the 10 bp deletion in the
R3 region and also showed the G to C nucleotide change resulting in a substitution from Lysine to
Aspartic acid, possibly contributing to the observed colorless phenotype. Another colorless apiculus
variety (Bashful), which lacks the 10 bp deletion, showed an amino acid change from Proline to
Arginine in exon 1, suggesting that this mutation could be associated with the coloration of the
apiculus. However, the other three colorless apiculus varieties (Borua Beroin, Lahi and Borjahinga),
which lack the 10 bp deletion in exon 3, did not carry the Proline to Arginine amino acid change,
suggesting that other genomic regions also play a role in determining apiculus color. The mutation
at position 845 of exon 3, which substitutes Valine for Alanine in three varieties (Til Bora,
Kawanglawang and Balam) and in O. rufipogon, showed no effect on apiculus color, suggesting that the
substitution of an amino acid with similar hydrophobicity at this position does not affect the
apiculus color phenotype. Overall, these observations suggest that multiple genomic regions are
involved in determining a particular phenotype. There are several examples of the involvement of
multiple genes or interacting loci in the determination of a phenotype [24,39,40]. Two of the SNPs,
the C to G mutation at position 122 in exon 1 and the G to T mutation at position 845, have already
been identified in a previous study [27]. The G to C mutation at position 60 in exon 1 is reported
for the first time in this study.

It is generally considered that the domestication process reduces
the nucleotide diversity at domestication related genes that
control specific traits selected during domestication. In other
words, genes that regulate a particular trait under positive
selection during the domestication and improvement process may
imprint 'signatures of selection' in the form of typical patterns
of reduced nucleotide diversity [10]. This is evidenced by much
lower levels of nucleotide diversity among glutinous rice at the
Wx gene as compared to the nonglutinous rice varieties [24,41].
Similar observations have been reported for the nonshattering sh4
allele, which shows reduced nucleotide sequence polymorphism in
cultivated rice varieties as compared to wild progenitors [42],
and for the ramosa1 gene, which controls branching architecture
in the tassel and ear and shows reduced diversity in cultivated
maize as compared to the wild teosintes [43]. However, the
present study revealed higher levels of nucleotide diversity in
the glutinous type varieties (π_tot = 0.0053) than in the
nonglutinous type varieties (π_tot = 0.0043) at the Wx locus.
This could be attributable to the fact that the Wx gene, which
has been associated with the glutinous nature of rice, may not be
the sole gene that determines the glutinous phenotype. This
phenotype is likely controlled by multiple loci. This
interpretation is further supported by the fact that the Wx
intron 1 splice donor site mutation (G to T) is also found in
some nonglutinous rice varieties, indicating that this mutation
is not necessarily responsible for the expression of the
glutinous phenotype [5,44]. These findings are in agreement with
other studies, which showed that interaction of other genes
(e.g. dull genes) may modify the phenotype of the Wx gene [45] or
other dull genes [46]. Teng et al. [47] suggested that allelic
variation at the Wx gene may not necessarily regulate the starch
properties in different rice varieties. The linkage association
study also showed an interplay of multiple genes in determining
starch physicochemical properties in rice [48].
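
As an illustration of how nucleotide diversity values of the kind compared above are obtained, the following minimal Python sketch computes π as the average number of pairwise differences per site. It assumes a gap-free alignment of equal-length sequences; the function names and the four toy sequences are hypothetical and are not data from this study.

```python
from itertools import combinations


def pairwise_differences(a: str, b: str) -> int:
    """Count the sites at which two aligned sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def nucleotide_diversity(seqs: list[str]) -> float:
    """Return pi: mean pairwise differences per site (gap-free alignment assumed)."""
    if len(seqs) < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same non-zero length")
    pairs = list(combinations(seqs, 2))
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / (len(pairs) * length)


# Toy alignment (hypothetical, not sequences from this study).
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTTCGTAC",
    "ACGAACGTAC",
]
pi_toy = nucleotide_diversity(toy_alignment)
print(f"pi = {pi_toy:.4f}")
```

Running the same function separately on the glutinous and nonglutinous subsets of an alignment would give group-wise values that can be compared in the way described in the text.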

Although selective sweeps may drastically reduce nucleotide
diversity in target genes such as the Wx locus [15], diversifying
selection due to environmental heterogeneity and local cultural
preferences favoring other traits may increase nucleotide
diversity [49]. The existence of diverse agroclimatic conditions
and various cultural practices of indigenous communities may have
played a significant role in the maintenance of high levels of
diversity in glutinous varieties of rice in NE India.

In the present study, positive Tajima's D values were detected
for the glutinous and nonglutinous varieties (Table 4), except
for small regions of the Wx gene that showed negative values
among nonglutinous varieties (Figure 4). Since the values of
Tajima's D were not significantly different from zero, the
overall distribution of nucleotide diversity falls within neutral
expectations (Table 4). Since demographic changes including
population expansion or reduction may influence all regions of
the genome equally, the differences in Tajima's D within and
between loci could be attributable to selection trends during the
domestication process. Therefore, regions of the gene that show
positive Tajima's D values could be attributable to balancing or
overdominant selection, whereas regions of the gene with negative
Tajima's D values could be associated with purifying selection.
The signature of positive selection shown in the McDonald and
Kreitman test at the Wx gene may be linked to some traits of
ecological adaptation to diverse agroclimatic conditions. The
deviations detected in various analyses are not significantly
different from neutral expectations and confirm that the
selection pressures associated with both traits are weak. Similar
results have also been reported in previous studies in rice [24]
and maize [13,14]. The total of 16 haplotypes detected at the Wx
locus is lower than the previously reported 18 haplotypes among
37 glutinous and 68 nonglutinous rice accessions from Asia [24].
However, the 16 haplotypes reported in our study are different
from the haplotypes found in the previous study. There was no
clear haplotype based partitioning of the rice varieties into
glutinous and nonglutinous varieties. Haplotype analysis based on
the Wx locus showed that haplotypes H1 to H5 formed a distinct
cluster consisting of only indigenous varieties, which could
serve as valuable material for future genetic improvement
programs. Although the number of haplotypes varied when indels
were considered in the network analysis, there was no clear
grouping based on phenotypes.
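
To make the interpretation of the sign of Tajima's D concrete, the sketch below implements the standard formula of the statistic, which contrasts the mean number of pairwise differences with the number of segregating sites scaled by the harmonic term a1. It is only an illustration under stated assumptions: the function name and the input values are hypothetical, and it does not reproduce the software or the values reported in Table 4.

```python
import math


def tajimas_d(n: int, seg_sites: int, mean_pairwise_diff: float) -> float:
    """Tajima's D from sample size n, segregating sites S and mean pairwise differences k."""
    if n < 4:
        raise ValueError("n must be at least 4 for a non-zero variance term")
    if seg_sites < 1:
        raise ValueError("at least one segregating site is required")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / math.sqrt(variance)


# Hypothetical inputs: 4 sequences, 2 segregating sites.
d_rare = tajimas_d(n=4, seg_sites=2, mean_pairwise_diff=1.0)    # fewer pairwise differences than expected
d_common = tajimas_d(n=4, seg_sites=2, mean_pairwise_diff=1.5)  # more pairwise differences than expected
print(f"D (excess of rare variants) = {d_rare:.3f}")
print(f"D (excess of intermediate-frequency variants) = {d_common:.3f}")
```

A negative value arises when the segregating sites are mostly rare variants, and a positive value when variants at intermediate frequency dominate; whether either departs significantly from zero has to be assessed separately.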

The OsC1 gene showed lower levels of polymorphism and reduced
nucleotide diversity among the colorless apiculus varieties as
compared to colored apiculus varieties. A low level of nucleotide
diversity is common in genes related to selected phenotypes
[24,42]. Sliding window analysis of the nucleotide diversity
showed that most regions of reduced nucleotide diversity in the
OsC1 gene were the same between colored and colorless apiculus
phenotypes (Figure 6). Such concordant loss of diversity could be
attributable to a population bottleneck during domestication
[50].

Evidence for selection among colorless apiculus varieties is
detected through a high dN/dS ratio at the OsC1 locus (Table 4).
As this gene is associated with synthesis of anthocyanins, which
have multiple functions including plant defense responses and
signalling in plant-microbe interactions [25,26], selection of
this gene among the cultivated rice varieties cannot be ruled
out. The negative Tajima's D values indicate an excess of rare
alleles (Table 4) at the OsC1 locus among the colorless apiculus
varieties, suggesting a possibility of purifying selection.
Colorless apiculus varieties showed more negative D values in the
coding regions than their colored apiculus counterparts. These
patterns are consistent with a recent selective sweep at the OsC1
gene among the colorless apiculus rice varieties. Translation of
the coding regions of the OsC1 gene revealed that the 10-bp
deletion within the third exon drastically reduces the protein
size from 272 to 206 amino acids. This might have a significant
impact on the expression of the OsC1 gene and regulation of
apiculus coloration in rice.
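
Because 10 is not a multiple of three, a 10-bp deletion in a coding region shifts the reading frame, and one plausible route to a shorter protein is that the shifted frame exposes a premature stop codon. The sketch below illustrates this mechanism only. The coding sequence is an invented toy example, not the gene sequence analyzed here, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_bases(cds: str, start: int, length: int) -> str:
    """Remove `length` bases beginning at 0-based position `start`."""
    return cds[:start] + cds[start + length:]


# Invented toy coding sequence (13 codons plus a stop codon).
toy_cds = "ATGGCTGAACTGACCGGTAAATGCGTTGACCTGAAAGCTTAA"
mutant_cds = delete_bases(toy_cds, start=6, length=10)

intact_protein = translate(toy_cds)
truncated_protein = translate(mutant_cds)
print(intact_protein, len(intact_protein))
print(truncated_protein, len(truncated_protein))
```

In this toy case the residues downstream of the deletion change and translation terminates early, which is the kind of effect that would shorten a predicted protein.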

The haplotype analysis revealed nine different haplotypes among
the colored and colorless apiculus varieties. The number of
detected haplotypes is about 50% lower than the number previously
reported (17) among 39 wild and cultivated rice [27]. On the
other hand, only two haplotypes reported in Saitoh et al. [27]
were detected in our samples and the remaining seven haplotypes
were unique to our study. These haplotypes formed two major
groups of rice varieties. However, this grouping did not
correspond to apiculus coloration. Similar results were also
obtained when gaps were included in the analysis. One group
showed affinity with the agronomically improved varieties and the
other group, consisting of only indigenous varieties, formed a
separate cluster.
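
Counting haplotypes amounts to collapsing identical aligned sequences into groups. A minimal sketch of that step is given below, assuming sequences are already aligned and compared as exact strings. The variety names and sequences are placeholders, and the H1, H2, ... labels are assigned here in order of first appearance, which need not match the labels used in the haplotype networks.

```python
from collections import defaultdict


def group_haplotypes(sequences: dict[str, str]) -> dict[str, list[str]]:
    """Group sample names by identical sequence and label groups H1, H2, ..."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, seq in sequences.items():
        groups[seq].append(name)
    # Dicts preserve insertion order, so labels follow first appearance.
    return {f"H{i}": names for i, names in enumerate(groups.values(), start=1)}


# Placeholder data (hypothetical names and sequences).
toy_sequences = {
    "variety_A": "ACGTACGT",
    "variety_B": "ACGTACGT",
    "variety_C": "ACGAACGT",
    "variety_D": "TCGAACGT",
}
haplotypes = group_haplotypes(toy_sequences)
for label, members in haplotypes.items():
    print(label, members)
```

Whether gap columns are kept in the strings before grouping changes which sequences count as identical, which is why the number of haplotypes can differ between analyses with and without gaps.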

Conclusion 

The present study, based on two trait specific genes, Wx and
OsC1, reported to be associated with amylose content and apiculus
coloration respectively, showed that mutations considered to be
associated with a given phenotype of the trait do not necessarily
correspond to those phenotypes in indigenous rice varieties in NE
India. This suggests that alternative genomic regions are also
involved in controlling the amylose content and apiculus
coloration in rice. Although statistically significant signatures
of selection were not detected in either gene, a low level of
selection that varied across the length of each gene was evident.

Availability of supporting data: Nucleotide sequences reported in
this paper have been submitted to GenBank with accession numbers
KJ934819 - KJ934878. The sequences have also been submitted to
LabArchives and can be accessed from the following link
(DOI 10.6070/H4H41PDH).